XPEDIA: XML ProcEssing for Data IntegrAtion
Abstract
Data integration engines increasingly need to provide sophisticated processing options for XML data. In the past, it was adequate for these engines to support basic shredding and XML generation capabilities. However, with the steady growth of XML in applications and databases, integration platforms need to provide more direct operations on XML and to improve the scalability and efficiency of these operations. In this paper, we describe a robust and comprehensive framework for performing Extract-Transform-Load (ETL) of XML. This includes (i) a full computational model and engine capabilities to perform these operations in an ETL flow, (ii) an approach to pushing down XML operations into a database engine capable of supporting XML processing, and (iii) methods to apply partitioning techniques to provide scalable, parallel processing for large XML documents. We describe experimental results showing the effectiveness of these techniques.
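As a rough illustration of two ideas named in the abstract, shredding XML into flat rows and partitioning a document so that its parts can be processed in parallel, the following Python sketch may help. It is not the XPEDIA engine or its computational model. The sample document, the element and attribute names, and every function name are hypothetical, and a thread pool stands in for whatever parallel runtime a real ETL engine would use.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input: element and attribute names are invented for illustration.
DOC = """<orders>
  <order id="1"><item sku="A" qty="2"/><item sku="B" qty="1"/></order>
  <order id="2"><item sku="C" qty="5"/></order>
  <order id="3"><item sku="A" qty="1"/></order>
</orders>"""


def shred_order(order):
    """Shred one <order> subtree into flat (order_id, sku, qty) rows."""
    order_id = int(order.get("id"))
    return [(order_id, item.get("sku"), int(item.get("qty")))
            for item in order.findall("item")]


def partition(elements, n):
    """Split a list of sibling subtrees into n round-robin partitions."""
    return [elements[i::n] for i in range(n)]


def shred_partition(orders):
    """Shred every <order> in one partition."""
    rows = []
    for order in orders:
        rows.extend(shred_order(order))
    return rows


def shred_parallel(xml_text, workers=2):
    """Parse the document, partition its top-level children, shred each partition."""
    root = ET.fromstring(xml_text)
    parts = partition(list(root), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(shred_partition, parts)
    # Sort so the output does not depend on how the partitions were formed.
    return sorted(row for part in results for row in part)


rows = shred_parallel(DOC)
```

Partitioning on top-level sibling subtrees is the simplest possible choice and only works when those subtrees can be processed independently. The sketch also parses the whole document before splitting it, which a system aimed at very large documents would presumably avoid.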
Similar resources
Grid Data Integration Based on Schema Mapping
Data integration is the flexible and managed federation, analysis, and processing of data from different distributed sources. Data integration is a key issue for exploiting the availability of large, heterogeneous, distributed and highly dynamic data volumes on Grids. This paper presents a framework for integrating heterogeneous XML data sources distributed among the nodes of a Grid. We present...
Querying Semi-structured Data with Mutual Exclusion
Data analytics applications, content-based collaborative platforms and office applications require the integration and management of current and historical data from heterogeneous sources. XML is a standard data format for information. Thanks to its semi-structured-ness, it is a good candidate data model for the integration and management of heterogeneous content. However, the management of his...
Converting XML Data To UML Diagrams For Conceptual Data Integration
The demand for data integration is rapidly becoming larger as more and more information sources appear in modern enterprises. In many situations a logical (rather than physical) integration of data is preferable since some data is inherently not suited for storing in a physically integrated data warehouse. Previous web-based data integration efforts have focused almost exclusively on the logica...
A Model-Based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems
The demand for data integration is rapidly becoming larger as more and more information sources appear in modern enterprises. Extensible Mark-up Language is fast becoming the new standard for data representation and exchange on the World Wide Web, e.g., in B2B e-commerce, making it necessary for data analysis tools to handle XML data as well as traditional data formats. This paper presents arch...
Towards Linked Data based Enterprise Information Integration
Data integration in large enterprises is a crucial but at the same time costly, long lasting and challenging problem. In the last decade, the prevalent data integration approaches were primarily based on XML, Web Services and Service Oriented Architectures (SOA). We argue that classic SOA architectures may be well-suited for transaction processing, however more efficient technologies can be emp...
Journal title:
- PVLDB
Volume 2
Publication date: 2009